 caseless model


Stanford CoreNLP

@machinelearnbot

If your text is all lowercase, all uppercase, or badly and inconsistently capitalized (as on many web forums, in text messages, on Twitter, etc.), this will negatively affect the performance of most of our annotators, because most of them were trained on conventionally edited text made up of properly capitalized full sentences. There are two strategies that may help. One is to first restore correct capitalization with a truecaser and then process the text with the standard models; see the TrueCaseAnnotator for how to do this. The other is to use models better suited to ill-capitalized text.
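
As a rough illustration of the two strategies, the sketch below builds the two property sets one might hand to a CoreNLP pipeline. It uses only `java.util.Properties` so that it runs without CoreNLP on the classpath. The class name `CaselessConfig` and its method names are hypothetical. The model paths are the ones we believe ship in the English models jar, but they are an assumption here and can differ between releases, so check them against the jar you actually have.

```java
import java.util.Properties;

// Hypothetical helper class, for illustration only.
final class CaselessConfig {

    // Strategy 1: truecase the text first, then use the standard models.
    static Properties truecaseProps() {
        Properties props = new Properties();
        // The truecase annotator runs after tokenize, ssplit, pos and lemma.
        props.setProperty("annotators", "tokenize,ssplit,pos,lemma,truecase");
        // Overwrite each token's text with its truecased form.
        props.setProperty("truecase.overwriteText", "true");
        return props;
    }

    // Strategy 2: keep the text as it is and load caseless models instead.
    // ASSUMPTION: these paths match the models jar in use; verify before relying on them.
    static Properties caselessProps() {
        Properties props = new Properties();
        props.setProperty("annotators", "tokenize,ssplit,pos,lemma,ner,parse");
        props.setProperty("pos.model",
            "edu/stanford/nlp/models/pos-tagger/english-caseless-left3words-distsim.tagger");
        props.setProperty("ner.model",
            "edu/stanford/nlp/models/ner/english.all.3class.caseless.distsim.crf.ser.gz");
        props.setProperty("parse.model",
            "edu/stanford/nlp/models/lexparser/englishPCFG.caseless.ser.gz");
        return props;
    }

    public static void main(String[] args) {
        // With CoreNLP available, either object would be passed to
        // new StanfordCoreNLP(props); here we only print the settings.
        truecaseProps().forEach((k, v) -> System.out.println(k + " = " + v));
        caselessProps().forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

Which strategy works better will likely depend on the text: truecasing keeps the standard models in play but can introduce its own errors, while the caseless models avoid that extra step.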